External inspections of rolling stock underfloor equipment are currently performed by human visual inspection. In this study, we attempt to partly automate this inspection by investigating anomaly detection algorithms based on image processing. Because railroad maintenance studies tend to have little anomaly data, unsupervised learning methods are usually preferred for anomaly detection; however, training cost and accuracy remain challenges. Prior work has created anomalous images from normal images by adding noise, but the anomaly targeted in this study, the rotation of piping cocks, is difficult to create in this way. We therefore propose a new method that applies style conversion via generative adversarial networks to three-dimensional computer graphics, imitating anomaly images so that anomaly detection can be trained in a supervised manner. A geometry-consistent style conversion model was used to convert the images, so that their color and texture imitate real images while the anomalous shape is preserved. Using the generated anomaly images as supervised data, the anomaly detection model can be trained without complex adjustments and successfully detects anomalies.
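As a rough illustration of the supervised setup described above (not the authors' implementation), the sketch below trains a plain binary classifier on real normal images together with style-converted CG anomaly images; the folder layout, ResNet-18 backbone, and hyperparameters are assumptions.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms, models

# Hypothetical folder layout: data/train/{normal,anomaly}/ where "anomaly"
# holds the style-converted CG images imitating rotated piping cocks.
tf = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
train_set = datasets.ImageFolder("data/train", transform=tf)
loader = DataLoader(train_set, batch_size=32, shuffle=True)

model = models.resnet18(num_classes=2)          # plain binary classifier
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(10):
    for images, labels in loader:
        opt.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        opt.step()
```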
Drug repositioning holds great promise because it can reduce the time and cost of new drug development. While drug repositioning can omit various R&D processes, confirming pharmacological effects on biomolecules is essential for application to new diseases. Biomedical explainability in a drug repositioning model can support appropriate insights in subsequent in-depth studies. However, the validity of XAI methodologies is still under debate, and the effectiveness of XAI in drug repositioning prediction applications remains unclear. In this study, we propose GraphIX, an explainable drug repositioning framework using biological networks, and quantitatively evaluate its explainability. GraphIX first learns network weights and node features with a graph neural network from known drug indications and a knowledge graph consisting of three types of nodes (disease, drug, and protein), without being given the node type information. Analysis of the learned features showed that node types, unknown to the model beforehand, are distinguished through the learning process based on the graph structure. From the learned weights and features, GraphIX then predicts disease-drug associations and calculates contribution values for the nodes located in the neighborhood of the predicted disease and drug. We hypothesized that neighboring protein nodes to which the model assigns a high contribution are important for understanding the actual pharmacological effects. Quantitative evaluation of the validity of protein node contributions using a real-world database showed that the high-contribution proteins identified by GraphIX are reasonable in terms of drug mechanism of action. GraphIX is a framework for evidence-based drug discovery that can present new disease-drug associations to users and identify, from a large and complex knowledge base, the proteins important for understanding their pharmacological effects.
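The following is a minimal sketch of the general pattern GraphIX builds on, a graph neural network link predictor with a node contribution score; the two-layer GCN, dot-product decoder, and gradient-norm attribution are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class SimpleGCN(nn.Module):
    """Two-layer graph convolution over a normalized adjacency matrix."""
    def __init__(self, n_nodes, dim=64):
        super().__init__()
        self.emb = nn.Embedding(n_nodes, dim)      # learned node features
        self.w1 = nn.Linear(dim, dim)
        self.w2 = nn.Linear(dim, dim)

    def forward(self, adj):                        # adj: (n, n), normalized
        h = self.emb.weight
        h = torch.relu(adj @ self.w1(h))
        return adj @ self.w2(h)

def link_score(h, drug_idx, disease_idx):
    return (h[drug_idx] * h[disease_idx]).sum()    # dot-product decoder

# Contribution of a neighboring node: gradient of the predicted score with
# respect to that node's embedding (one of several possible XAI choices).
def neighbor_contribution(model, adj, drug_idx, disease_idx, node_idx):
    h = model(adj)
    score = link_score(h, drug_idx, disease_idx)
    grad = torch.autograd.grad(score, model.emb.weight)[0]
    return grad[node_idx].norm().item()
```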
We present a lightweight post-processing method to refine the semantic segmentation results of point cloud sequences. Most existing methods segment frame by frame and thus encounter an inherent ambiguity of the problem: based on a measurement from a single frame, labels are sometimes difficult to predict even for humans. To remedy this problem, we propose to explicitly train a network to refine the results predicted by an existing segmentation method. The network, which we call P2Net, learns consistency constraints between coincident points from consecutive frames after registration. We evaluate the proposed post-processing method both qualitatively and quantitatively on the SemanticKITTI dataset, which consists of real outdoor scenes. The effectiveness of the proposed method is validated by comparing the results predicted by two representative networks with and without refinement by the post-processing network. Specifically, qualitative visualization validates the key idea that the labels of points that are difficult to predict can be corrected by P2Net. Quantitatively, overall mIoU is improved from 10.5% to 11.7% for PointNet [1] and from 10.8% to 15.9% for PointNet++ [2].
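A minimal sketch of the refinement idea, assuming each current-frame point has already been matched to a coincident point in the registered previous frame; the two-layer MLP and the choice of raw class logits as features are placeholders, not the actual P2Net architecture.

```python
import torch
import torch.nn as nn

class RefineNet(nn.Module):
    """Refines per-point class logits using the logits of coincident points
    from the previous (registered) frame. Hypothetical stand-in for P2Net."""
    def __init__(self, n_classes):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * n_classes, 64), nn.ReLU(),
            nn.Linear(64, n_classes))

    def forward(self, cur_logits, prev_logits, match_idx):
        # match_idx[i] = index of the coincident point in the previous frame
        paired = prev_logits[match_idx]                    # (N, C)
        return self.mlp(torch.cat([cur_logits, paired], dim=-1))

# usage: refined = RefineNet(20)(cur, prev, idx), trained with cross-entropy
```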
To ensure the safety of railroad operations, it is important to monitor and forecast track geometry irregularities. Higher safety requires forecasting at a higher spatiotemporal frequency, which in turn requires capturing spatial correlations. Additionally, track geometry irregularities are influenced by multiple exogenous factors. In this study, we propose a method to forecast one type of track geometry irregularity, vertical alignment, by incorporating spatial and exogenous factor calculations. The proposed method embeds exogenous factors and captures spatiotemporal correlations using a convolutional long short-term memory (ConvLSTM). In the experiments, we compared the proposed method with other methods in terms of forecasting performance, and we conducted an ablation study on the exogenous factors to examine their contribution. The results reveal that spatial calculations and maintenance record data improve the forecasting of vertical alignment.
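A minimal sketch of the core building block, assuming a 1-D ConvLSTM over the along-track axis with exogenous factors (e.g. maintenance records) embedded as additional input channels; channel counts and sequence lengths are illustrative only.

```python
import torch
import torch.nn as nn

class ConvLSTM1dCell(nn.Module):
    """Minimal 1-D ConvLSTM cell over the along-track spatial axis."""
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.conv = nn.Conv1d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, h, c):
        i, f, o, g = torch.chunk(self.conv(torch.cat([x, h], dim=1)), 4, dim=1)
        c = f.sigmoid() * c + i.sigmoid() * g.tanh()
        h = o.sigmoid() * c.tanh()
        return h, c

# Hypothetical inputs: measured irregularity (1 channel) plus embedded
# exogenous factors (4 channels), concatenated along the channel axis.
B, L, hid = 8, 512, 32
cell = ConvLSTM1dCell(in_ch=1 + 4, hid_ch=hid)
h = torch.zeros(B, hid, L)
c = torch.zeros(B, hid, L)
for t in range(12):                        # 12 past measurement runs
    x = torch.cat([torch.randn(B, 1, L), torch.randn(B, 4, L)], dim=1)
    h, c = cell(x, h, c)
forecast = nn.Conv1d(hid, 1, 1)(h)         # next-run vertical alignment
```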
This paper addresses recipe generation from unsegmented cooking videos, a task that requires an agent to (1) extract the key events involved in completing a dish and (2) generate sentences for the extracted events. Our task is similar to dense video captioning (DVC), which aims to detect events exhaustively and generate sentences for them. However, unlike DVC, recipe generation requires recipe story awareness: the model should output an appropriate number of key events in the correct order. We analyzed the outputs of DVC models and observed that, although (1) several events can be adopted as a recipe story, (2) the sentences generated for such events are not grounded in the visual content. Based on this, we hypothesize that a correct recipe can be obtained by selecting oracle events from the events output by a DVC model and re-generating sentences for them. To achieve this, we propose a novel Transformer-based joint approach that trains an event selector and a sentence generator to select oracle events from the DVC model's outputs and generate grounded sentences for those events, respectively. In addition, we extend the model to generate more accurate recipes by incorporating ingredients. Experimental results show that the proposed method outperforms state-of-the-art DVC models. We also confirm that, by modeling recipes in a story-aware manner, the proposed model outputs an appropriate number of events in the correct order.
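A minimal sketch of the event-selection half of such a pipeline, assuming each DVC candidate event has already been encoded as a feature vector; the Transformer encoder with a keep/skip head is an illustrative stand-in for the proposed selector, and sentence re-generation is only indicated in a comment.

```python
import torch
import torch.nn as nn

class EventSelector(nn.Module):
    """Scores DVC candidate events and keeps those likely to form a recipe story."""
    def __init__(self, d=256):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d, 1)

    def forward(self, event_feats):            # (B, n_events, d)
        h = self.encoder(event_feats)
        return self.head(h).squeeze(-1)        # keep/skip logit per event

selector = EventSelector()
feats = torch.randn(1, 30, 256)                # 30 candidate events from a DVC model
keep = selector(feats)[0] > 0                  # selected events, in temporal order
# A separate sentence generator (here only hinted at) would then re-generate
# one grounded, ingredient-aware sentence per selected event.
```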
Prompt learning has been shown to achieve near fine-tuning performance on most text classification tasks with very few training examples, which is advantageous for NLP tasks where samples are scarce. In this paper, we attempt to apply it to a practical scenario, resume information extraction, and to enhance existing methods so that they are better suited to this task. In particular, we create multiple sets of manual templates and verbalizers based on the textual characteristics of resumes. In addition, we compare the performance of masked language model (MLM) pre-trained language models (PLMs) and Seq2Seq PLMs on this task. Furthermore, we improve the verbalizer design method for knowledgeable prompt-tuning, providing an example for designing prompt templates and verbalizers for other application-oriented NLP tasks. In this context, we propose the concept of the Manual Knowledgeable Verbalizer (MKV), a rule for constructing knowledgeable verbalizers tailored to the application scenario. Experiments show that templates and verbalizers designed according to our rules are more effective and robust than existing manual templates and automatically generated prompting methods. We find that currently available automatic prompting methods cannot compete with manually designed prompt templates in some realistic task scenarios. The results of the final confusion matrices show that our proposed MKV significantly alleviates the sample imbalance problem.
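A minimal sketch of prompt-based classification with a manually designed template and a knowledgeable verbalizer, in the spirit of the MKV idea; the template wording, label set, and label words below are invented for illustration and use the generic `transformers` masked-LM API rather than the authors' setup.

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Hypothetical template and knowledgeable verbalizer for resume sentences;
# label words are chosen from domain knowledge, not learned.
template = "{sentence} This sentence is about [MASK]."
verbalizer = {"education": ["education", "degree", "university"],
              "experience": ["experience", "work", "employment"]}

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def classify(sentence):
    text = template.format(sentence=sentence).replace("[MASK]", tok.mask_token)
    inputs = tok(text, return_tensors="pt")
    mask_pos = (inputs.input_ids == tok.mask_token_id).nonzero()[0, 1]
    probs = mlm(**inputs).logits[0, mask_pos].softmax(-1)
    # score each label by the total probability mass on its label words
    scores = {label: sum(probs[tok.convert_tokens_to_ids(w)] for w in words)
              for label, words in verbalizer.items()}
    return max(scores, key=scores.get)

print(classify("M.Sc. in Computer Science, University of Tokyo, 2019."))
```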
We present a new multimodal dataset called Visual Recipe Flow, which enables us to learn the result of each cooking action. The dataset consists of object state changes and the workflow of the recipe text. State changes are represented as image pairs, while the workflow is represented as a recipe flow graph (R-FG). The image pairs are grounded in the R-FG, which provides cross-modal relations. With our dataset, a range of applications can be attempted, from multimodal commonsense reasoning to procedural text generation.
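A minimal sketch of how such data might be represented in code, with each recipe step grounded in a before/after image pair and linked by flow-graph edges; the class names and file paths are hypothetical, not the dataset's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class StateChange:
    """An object state change observed as a before/after image pair."""
    before_image: str
    after_image: str

@dataclass
class FlowNode:
    """A node of the recipe flow graph (R-FG), grounded in an image pair."""
    step_text: str
    state_change: Optional[StateChange] = None
    successors: List[int] = field(default_factory=list)   # edges of the R-FG

recipe = [
    FlowNode("Chop the onion.",
             StateChange("onion_whole.jpg", "onion_chopped.jpg"), [1]),
    FlowNode("Fry the onion until golden.",
             StateChange("onion_chopped.jpg", "onion_fried.jpg")),
]
```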
Particle filtering is a standard Monte Carlo method for a wide range of sequential inference tasks. The key ingredient of a particle filter is a set of particles with importance weights that serve as a proxy for the true posterior distribution of some stochastic process. In this work, we propose continuous latent particle filters, an approach that extends particle filtering to the continuous time domain. We demonstrate how continuous latent particle filters can be used as a generic plug-in replacement for inference techniques that rely on a learned variational posterior. Our experiments on different model families based on latent neural stochastic differential equations demonstrate the superior performance of continuous-time particle filtering in inference tasks such as likelihood estimation and sequential prediction for a variety of stochastic processes.
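For reference, the sketch below shows the standard discrete-time bootstrap particle filter that the proposed continuous latent particle filter generalizes; the toy random-walk model and noise scales are arbitrary.

```python
import numpy as np

def bootstrap_particle_filter(obs, n_particles, transition, likelihood, init):
    """Standard discrete-time bootstrap particle filter (the construction that
    continuous latent particle filters extend to the continuous time domain)."""
    particles = init(n_particles)                    # (n, d) initial samples
    log_evidence = 0.0
    for y in obs:
        particles = transition(particles)            # propagate through dynamics
        w = likelihood(y, particles)                 # importance weights
        log_evidence += np.log(w.mean() + 1e-300)    # likelihood estimate
        w = w / w.sum()
        idx = np.random.choice(n_particles, n_particles, p=w)
        particles = particles[idx]                   # resample
    return particles, log_evidence

# toy usage: 1-D random walk observed with Gaussian noise
obs = np.cumsum(np.random.randn(50)) + 0.5 * np.random.randn(50)
particles, ll = bootstrap_particle_filter(
    obs, 1000,
    transition=lambda x: x + np.random.randn(*x.shape),
    likelihood=lambda y, x: np.exp(-0.5 * ((y - x[:, 0]) / 0.5) ** 2),
    init=lambda n: np.random.randn(n, 1))
```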
Information extraction (IE) has always been one of the important tasks in NLP, and one of its most critical application scenarios is information extraction from resumes. Structured text is obtained by classifying each part of a resume, which makes the text convenient to store for later search and analysis. Structured resume data can also be used in AI resume-screening systems, greatly reducing the labor cost of human resources departments. This study aims to transform the resume information extraction task into a simple sentence classification task. Based on the English resume dataset produced by previous research, we improve the classification rules to create a larger and more fine-grained resume classification dataset. The corpus is also used to test the performance of several current mainstream pre-trained language models (PLMs). Furthermore, to explore the relationship between the number of training samples and the accuracy on the resume dataset, we conduct comparison experiments with training sets of different sizes. The final results of multiple experiments show that the resume dataset with improved annotation rules and a larger sample size yields higher accuracy than the original resume dataset.
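A minimal sketch of the resulting sentence-classification formulation, fine-tuning a generic pre-trained language model on resume sentences; the label names, example sentences, and choice of `bert-base-uncased` are assumptions for illustration.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical label set; the actual dataset uses finer-grained resume sections.
labels = ["personal_info", "education", "experience", "skills"]
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(labels))

sentences = ["B.Sc. in Physics, 2015-2019.", "Fluent in Python and SQL."]
targets = torch.tensor([1, 3])                       # education, skills

batch = tok(sentences, padding=True, truncation=True, return_tensors="pt")
out = model(**batch, labels=targets)                 # loss computed internally
out.loss.backward()                                  # one training step
print(out.logits.argmax(dim=-1))                     # predicted section per sentence
```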
Automated video-based assessment of surgical skills is a promising task for assisting young surgical trainees, especially in resource-poor regions. Existing works usually resort to a joint CNN-LSTM framework in which an LSTM models long-term relationships over spatially pooled short-term CNN features. However, this practice inevitably neglects the differences among semantic concepts such as tools, tissues, and background in the spatial dimension, impeding the subsequent temporal relationship modeling. In this paper, we propose a novel skill assessment framework, Video Semantic Aggregation (ViSA), which discovers different semantic parts and aggregates them across spatiotemporal dimensions. The explicit discovery of semantic parts provides an explanatory visualization that helps understand the neural network's decisions. It also enables us to further incorporate auxiliary information, such as kinematic data, to improve representation and performance. Experiments on two datasets show the competitiveness of ViSA compared with state-of-the-art methods. The source code is available at: bit.ly/miccai2022visa.
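A minimal sketch of the semantic aggregation idea, softly assigning spatial features to a few semantic groups before temporal modeling instead of global pooling; the group count, feature sizes, and the LSTM temporal model are illustrative, not the actual ViSA design.

```python
import torch
import torch.nn as nn

class SemanticAggregation(nn.Module):
    """Softly assigns spatial CNN features to K semantic groups (e.g. tools,
    tissue, background) and pools each group, instead of global pooling."""
    def __init__(self, channels, n_groups=3):
        super().__init__()
        self.assign = nn.Conv2d(channels, n_groups, kernel_size=1)

    def forward(self, feats):                         # (B, T, C, H, W)
        B, T, C, H, W = feats.shape
        x = feats.reshape(B * T, C, H, W)
        a = self.assign(x).softmax(dim=1)             # (B*T, K, H, W) assignment maps
        parts = torch.einsum("bkhw,bchw->bkc", a, x)  # per-group pooled features
        return parts.reshape(B, T, -1)                # flatten groups for temporal model

feats = torch.randn(2, 16, 256, 7, 7)                 # 16 frames of CNN features
parts = SemanticAggregation(256)(feats)               # (2, 16, 3 * 256)
temporal = nn.LSTM(3 * 256, 128, batch_first=True)(parts)[0]
```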